Automatic Transcription of English Broadcast News
Abstract
This paper describes the Philips Broadcast News transcription system. The Broadcast News task aims at recognizing "found" speech in radio and television broadcasts without any additional side information (e.g., speaking style or background conditions). The system was derived from the Philips continuous mixture density crossword HMM system, using MFCC features and Laplacian densities. The broadcasts were first segmented into sentence-like partitions. Using data-driven clustering, the resulting segments were then grouped into clusters with similar acoustic conditions for adaptation purposes. Gender-independent word-internal and crossword triphone models were trained on 70 hours of the HUB4 training data. No focus-condition-specific training was applied. Channel and speaker normalization was done by mean and variance normalization as well as vocal tract normalization (VTN) and maximum likelihood linear regression (MLLR). The transcription was produced by an adaptive multiple-pass decoder, starting with a phrase-bigram decoding using word-internal triphones and finishing with a phrase-trigram decoding using MLLR-adapted crossword models.
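The mean and variance normalization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the statistics are estimated over a single segment of feature vectors (the abstract does not say whether they are computed per segment or per cluster), and the function name, the `eps` parameter, and the toy data are hypothetical, not taken from the Philips system.

```python
from math import sqrt
from typing import List

Frame = List[float]  # one feature vector (e.g. MFCC coefficients) per frame


def mean_variance_normalize(frames: List[Frame], eps: float = 1e-8) -> List[Frame]:
    """Scale each feature dimension to zero mean and unit variance.

    Statistics are estimated over the frames passed in (assumed here to be
    one segment); eps guards against division by zero for constant features.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over all frames of the segment.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Per-dimension (biased) variance over the same frames.
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)]
    scales = [1.0 / sqrt(v + eps) for v in variances]
    return [[(f[d] - means[d]) * scales[d] for d in range(dim)] for f in frames]


# Toy two-dimensional "segment" with very different dynamic ranges.
segment = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = mean_variance_normalize(segment)
```

After normalization both dimensions share the same scale, which is the property that makes such features less sensitive to channel and speaker differences.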
Similar resources
Japanese broadcast news transcription
In this paper, we describe the on-going development of a Japanese Broadcast News Transcription system at BBN Technologies. This is a collaboration between BBN and NHK to use automatic speech recognition technology to provide live closed caption for NHK’s TV news programs in Japan. We describe what the NHK Broadcast News Corpus comprises and how we adopted transcription technology developed for ...
The L2F Broadcast News Speech Recognition System
Broadcast news play an important role in our lives providing access to news, information and entertainment. The existence of an automatic transcription is an important medium that not only can provide subtitles for inclusion of people with special needs or be an advantage on noisy and populated environments, but also because it enables data search and retrieve capabilities over the multimedia s...
The need to create a media block for the convergence of overseas news networks
As a general diplomacy arm of the Islamic Republic of Iran, VoSiMa has extensive activities in international broadcasting of its radio and television programs. These programs are broadcast in different languages, such as English, French, Azeri, Arabic, and ... for regional and transnational audiences. The large volume of the organization's international activities is in the form of news and new...
Real-time rich-content transcription of Chinese broadcast news
This paper describes the recent development of an Audio Indexing System for Chinese (Mandarin) broadcast news. Key issues of the three major components: automatic speech recognition, speaker identification and named entity extraction are addressed. The Chinese-language-specific challenges are discussed and our solutions are described. The recognition accuracy of the final system is comparable t...
Resource development and experiments in automatic South African broadcast news transcription
We present a description of the development and evaluation of a first South African broadcast news transcription system. We describe a number of speech resources which have been collected in the resource-scarce South African environment for system development purposes: a 20 hour corpus of South African English (SAE) broadcast news; a 109M word corpus of South African newspaper text collected fo...
The 300k LIMSI German broadcast news transcription system
This paper describes improvements to the existing LIMSI German broadcast news transcription system, especially its extension from a 65k vocabulary to 300k words. Automatic speech recognition for German is more problematic than for a language such as English in that the inflectional morphology of German and its highly generative process of compounding lead to many more out of vocabulary words fo...
Publication date: 1998